Yore
A Rust library for decoding and encoding character sets based on OEM code pages.
Features
- Fast performance *
- Minimal memory usage with Cow and shrink_to_fit
- Easy-to-use API
- Broad range of supported code pages
- Handles code pages with redefined ASCII characters (<0x80), such as '٪' in CP864
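The Cow point above can be illustrated with a minimal, standard-library-only sketch. This is not yore's implementation: `decode_sketch` and `high_byte_to_char` are hypothetical names, only three CP850 entries are filled in, and the sketch assumes a code page whose lower half is plain ASCII (so it does not cover redefined-ASCII pages such as CP864). The idea is that pure-ASCII input can be returned as a borrowed string with no allocation, while anything else is decoded into an owned String that is then shrunk to its final size.

```rust
use std::borrow::Cow;

/// Hypothetical lookup for bytes >= 0x80. Only three CP850 entries are
/// filled in; every other high byte falls back to U+FFFD in this sketch.
fn high_byte_to_char(byte: u8) -> char {
    match byte {
        0x81 => 'ü',
        0x82 => 'é',
        0x84 => 'ä',
        _ => '\u{FFFD}',
    }
}

/// Decodes bytes, borrowing the input when no conversion is needed.
/// Assumes the code page leaves 0x00..=0x7F as plain ASCII.
fn decode_sketch(bytes: &[u8]) -> Cow<'_, str> {
    if bytes.is_ascii() {
        // ASCII bytes are already valid UTF-8, so no allocation is needed.
        return Cow::Borrowed(std::str::from_utf8(bytes).expect("ASCII is valid UTF-8"));
    }
    // Over-allocate for the worst case (3 UTF-8 bytes per input byte),
    // then release the unused capacity once decoding is done.
    let mut out = String::with_capacity(bytes.len() * 3);
    for &b in bytes {
        out.push(if b < 0x80 { b as char } else { high_byte_to_char(b) });
    }
    out.shrink_to_fit();
    Cow::Owned(out)
}

fn main() {
    let ascii = decode_sketch(b"text");
    assert!(matches!(ascii, Cow::Borrowed("text")));

    let mixed = decode_sketch(&[0x66, 0x81, 0x72]); // 'f', 0x81, 'r'
    assert!(matches!(mixed, Cow::Owned(_)));
    assert_eq!(mixed, "für");
}
```

Callers that only read the result can treat both variants as a `&str`; the allocation cost is paid only when the input actually contains non-ASCII bytes.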
Usage
Add yore to your Cargo.toml file.
[dependencies]
yore = "1.1.0"
Examples
Using a specific code page
use ;
use ;
// Vec contains ASCII "text"
let bytes = vec![116, 101, 120, 116];
// Vec contains ASCII "text " and codepoint 231
let bytes_undefined = vec![116, 101, 120, 116, 32, 231];
// Notice that decoding CP850 can't fail because it is completely defined
assert_eq!;
// However, CP857 can fail
assert_eq!;
// "text " + codepoint 231
assert!;
// Lossy decoding won't fail due to fallback
assert_eq!;
// Encoding
assert_eq!;
assert!;
assert_eq!;
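The difference between strict and lossy decoding noted in the comments above can be sketched without the library. Everything here is hypothetical and standard-library-only: `lookup`, `UndefinedByte`, `decode_strict`, and `decode_lossy` are illustrative names rather than yore's API, and the table is a stand-in that defines only ASCII plus one high byte, treating codepoint 231 (0xE7) as undefined, as the CP857 example states.

```rust
/// Hypothetical lookup for a code page with gaps.
/// Only ASCII and one high byte are defined in this sketch.
fn lookup(byte: u8) -> Option<char> {
    match byte {
        0x00..=0x7F => Some(byte as char),
        0x87 => Some('ç'),
        // 0xE7 (231) is the undefined codepoint from the example above;
        // all other high bytes are simply left out of this sketch.
        _ => None,
    }
}

/// Hypothetical error type: where decoding stopped and on which byte.
#[derive(Debug, PartialEq)]
struct UndefinedByte {
    position: usize,
    value: u8,
}

/// Strict decoding: fails on the first byte without a mapping.
fn decode_strict(bytes: &[u8]) -> Result<String, UndefinedByte> {
    bytes
        .iter()
        .enumerate()
        .map(|(position, &value)| lookup(value).ok_or(UndefinedByte { position, value }))
        .collect()
}

/// Lossy decoding: never fails, substitutes U+FFFD for unmapped bytes.
fn decode_lossy(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| lookup(b).unwrap_or('\u{FFFD}')).collect()
}

fn main() {
    let bytes = vec![116, 101, 120, 116];
    let bytes_undefined = vec![116, 101, 120, 116, 32, 231];

    assert_eq!(decode_strict(&bytes), Ok("text".to_string()));
    assert_eq!(
        decode_strict(&bytes_undefined),
        Err(UndefinedByte { position: 5, value: 231 })
    );
    assert_eq!(decode_lossy(&bytes_undefined), "text \u{FFFD}");
}
```

A completely defined code page has no `None` entries in its table, which is why decoding it never needs to return an error.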
Using a trait object
use CodePage;
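As a rough illustration of why a trait object is useful here, the sketch below picks a decoder at runtime from a numeric identifier and passes it around as `&dyn`. It is standard-library-only and entirely hypothetical: `SketchCodePage`, `Sketch437`, `Sketch1252`, `by_identifier`, and `decode_with` are made-up names that do not reflect yore's actual trait or method signatures, and each page maps just one high byte.

```rust
/// Hypothetical stand-in for a code page trait; not yore's API.
trait SketchCodePage {
    fn name(&self) -> &'static str;
    fn decode_byte(&self, byte: u8) -> char;
}

struct Sketch437;
struct Sketch1252;

impl SketchCodePage for Sketch437 {
    fn name(&self) -> &'static str {
        "ibm437"
    }
    fn decode_byte(&self, byte: u8) -> char {
        match byte {
            0x00..=0x7F => byte as char,
            0x80 => 'Ç',
            _ => '\u{FFFD}', // remaining entries omitted in this sketch
        }
    }
}

impl SketchCodePage for Sketch1252 {
    fn name(&self) -> &'static str {
        "windows-1252"
    }
    fn decode_byte(&self, byte: u8) -> char {
        match byte {
            0x00..=0x7F => byte as char,
            0x80 => '€',
            _ => '\u{FFFD}', // remaining entries omitted in this sketch
        }
    }
}

/// Chooses a code page at runtime, e.g. from a file header or user setting.
fn by_identifier(id: u16) -> Option<Box<dyn SketchCodePage>> {
    match id {
        437 => Some(Box::new(Sketch437)),
        1252 => Some(Box::new(Sketch1252)),
        _ => None,
    }
}

/// Works with any code page without being generic over its concrete type.
fn decode_with(code_page: &dyn SketchCodePage, bytes: &[u8]) -> String {
    bytes.iter().map(|&b| code_page.decode_byte(b)).collect()
}

fn main() {
    for id in [437u16, 1252] {
        let cp = by_identifier(id).expect("identifier is covered by the sketch");
        println!("{}: {}", cp.name(), decode_with(cp.as_ref(), &[0x41, 0x80]));
    }
}
```

The same byte 0x80 decodes differently depending on which page was selected, and the calling code does not need to know the concrete type at compile time.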
Supported code pages
Identifier | Name | Description |
---|---|---|
437 | ibm437 | OEM United States |
737 | ibm737 | OEM Greek (formerly 437G); Greek (DOS) |
775 | ibm775 | OEM Baltic; Baltic (DOS) |
850 | ibm850 | OEM Multilingual Latin 1; Western European (DOS) |
852 | ibm852 | OEM Latin 2; Central European (DOS) |
855 | ibm855 | OEM Cyrillic (primarily Russian) |
857 | ibm857 | OEM Turkish; Turkish (DOS) |
860 | ibm860 | OEM Portuguese; Portuguese (DOS) |
861 | ibm861 | OEM Icelandic; Icelandic (DOS) |
862 | dos-862 | OEM Hebrew; Hebrew (DOS) |
863 | ibm863 | OEM French Canadian; French Canadian (DOS) |
864 | ibm864 | OEM Arabic; Arabic (864) |
865 | ibm865 | OEM Nordic; Nordic (DOS) |
866 | cp866 | OEM Russian; Cyrillic (DOS) |
869 | ibm869 | OEM Modern Greek; Greek, Modern (DOS) |
874 | windows-874 | Thai (Windows) |
910 | ibm910 | IBM-PC APL2 |
1250 | windows-1250 | ANSI Central European; Central European (Windows) |
1251 | windows-1251 | ANSI Cyrillic; Cyrillic (Windows) |
1252 | windows-1252 | ANSI Latin 1; Western European (Windows) |
1253 | windows-1253 | ANSI Greek; Greek (Windows) |
1254 | windows-1254 | ANSI Turkish; Turkish (Windows) |
1255 | windows-1255 | ANSI Hebrew; Hebrew (Windows) |
1256 | windows-1256 | ANSI Arabic; Arabic (Windows) |
1257 | windows-1257 | ANSI Baltic; Baltic (Windows) |
1258 | windows-1258 | ANSI/OEM Vietnamese; Vietnamese (Windows) |
* Benchmarks
encoding_rs supports only a few of the encodings that oem_cp and yore support. Additionally, encoding_rs focuses on streaming use cases.
Refer to the bench crate for more details.